DETAILED ACTION

1. This Office Action is in response to correspondence filed August 6, 2007 in
reference to application 10/733,995. Claims 1-16 are pending in the application and 
have been examined. 

Response to Amendment 

2. The amendments to the claims filed August 6, 2007 have been accepted and 
considered in this office action. Claim 7 has been amended. 

Response to Arguments 

Applicant's arguments with respect to claims 1, 7, and 13 have been considered 
but are moot in view of the new ground(s) of rejection. 

Claim Objections 

3. Claim 16 is written as dependent on claim 1. However, the language of the claim,
and the fact that these limitations are already covered by claim 6, suggest that
this claim should in fact depend on claim 13 instead, and it will be considered as
such for purposes of examination. Appropriate correction is required.

4. Claims 8-12 are objected to because they refer to the machine readable storage of
claim 7; however, claim 7 has been amended to read computer program product and not
machine readable storage. Appropriate correction is required.

Claim Rejections - 35 USC § 101

5. 35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

6. Claims 7-12 are rejected under 35 U.S.C. 101 because the claimed invention is 
directed to non-statutory subject matter. Claim 7 attempts to claim a computer program 
product. However, this could be interpreted to be just, considered non-statutory under
35 U.S.C. 101. Therefore claim 7 is rejected, as are claims 8-12 as being dependent
on claim 7.

Claim Rejections - 35 USC § 103 

7. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made.

8. Claims 1, 3, 4, 7, 9 and 10 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Mahajan et al. (US Patent 7,117,153) in view of Yuschik
(US Patent 7,139,706). 

9. Consider claim 1, Mahajan teaches a method of evaluating the quality of voice
input recognition by a voice system (figure 2, shows a method for evaluating recognition
in a voice system such as figure 1, connected to Wide area Network 173, that could be
used to access data.), said method comprising the steps of: 

extracting a current grammar (text words in the recognition model to be 
evaluated) from the voice portal (a portion of training text is selected to be spoken 304, 
Figure 3, Column 5 line 11.); 

generating a test input for the current grammar, the test input including a test 
pattern and a set of active grammars for the current grammar (At step 202, a portion of 
training data 304 is spoken by a person 308 to generate a test signal, in order to test the 
recognition models; Column 5 line 11.); 

providing the test input to the voice system (voice recognition system software) 
(The acoustic signal is converted into waveforms by receiver 309 and feature extractor 
310, and the feature vectors are provided to a decoder 312; column 5 lines 13-15.); 

analyzing the test pattern with respect to the set of active grammars with a 
speech recognition engine in the voice system (At step 204, the predicted sequence of
speech units is aligned with the actual sequence of speech units from training data 304;
column 5, line 37.); and

deriving a measure of quality of recognition for the current grammar (Under one 
embodiment, this objective function is an error function that indicates the degree to 
which the predicted sequence of speech units differs from the actual sequence of 
speech units after the alignment is complete; column 5, lines 44-47.). 

But Mahajan does not specifically teach that the voice system is a voice portal. 



Application/Control Number: 10/733,995 Page 5 

Art Unit: 2626 

In the same field of speech systems, Yuschik teaches that the voice system is a 
voice portal (It is an object of the invention to design and select the vocabulary for a 
voice activated system (portal) column 3, line 7-20. The menus of the portal are shown 
in figure 4.) 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention for a voice portal to be the voice system being tested and developed as 
taught by Yuschik with the testing system of Mahajan in order to allow for voice portals 
to be adapted to the users' spoken languages, Yuschik column 2 line 57.
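
For illustration only, the kind of alignment-based error function cited above from Mahajan (column 5, lines 44-47) can be sketched as follows. The sketch assumes a plain Levenshtein alignment between the predicted and actual speech-unit sequences; the names `edit_distance` and `unit_error_rate` are hypothetical, and neither reference is asserted to use this exact computation.

```python
def edit_distance(predicted, actual):
    """Minimum number of substitutions, insertions, and deletions
    needed to turn the predicted sequence into the actual one."""
    # prev[j] holds the distance between the units consumed so far
    # from `predicted` and the first j units of `actual`.
    prev = list(range(len(actual) + 1))
    for i, p in enumerate(predicted, start=1):
        curr = [i]
        for j, a in enumerate(actual, start=1):
            cost = 0 if p == a else 1
            curr.append(min(prev[j] + 1,          # delete p
                            curr[j - 1] + 1,      # insert a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def unit_error_rate(predicted, actual):
    """Edit distance normalized by the length of the actual sequence."""
    if not actual:
        raise ValueError("actual sequence must not be empty")
    return edit_distance(predicted, actual) / len(actual)
```

A lower rate indicates that the predicted sequence is closer to the actual sequence, which is one possible "measure of quality of recognition" in the sense of the claim.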

10. Consider claim 3, Yuschik teaches the method of claim 1, comprising the steps 
of: 

modifying the current grammar to create a modified grammar (word list to be 
used for recognition) if the measure of quality of recognition for the current grammar 
deviates from a pre-determined range (figure 3, step 340 does an acoustic analysis to 
determine similarity in order to reduce recognition error, step 350 selects alternative 
words if necessary, thereby providing a less confusable alternative to the words 
available to be recognized; column 11 line 34- column 13 line 3). 

11. Consider claim 4, Mahajan in view of Yuschik suggests the method of claim 3,
further comprising the steps of: 

(i) generating a test input for the modified grammar, the test input including a test 
pattern and a set of active grammars for the modified grammar; 

(ii) providing the test input for the modified grammar to the voice portal;

(iii) analyzing the test pattern for the modified grammar with respect to the set of 
active grammars corresponding to the modified grammar with the speech recognition 
engine in the voice portal; 

(iv) deriving a measure of quality of recognition of the modified grammar; and 

(v) re-modifying the modified grammar and repeating steps (i) through (iv) until 
the measure of quality of recognition of the modified grammar does not deviate from a 
pre-determined range. 

This is merely reanalyzing the output of the recognizer after the grammar has 
been updated. Figure 3 of Yuschik shows that the acoustical analysis of 340 is 
repeated until the acoustical difference is great enough to allow for accurate speech 
recognition. These analysis steps (i-iv) are the same as those of claim 1, which can clearly be
accomplished by the method of Mahajan as discussed above, and acoustical distance
would obviously affect the result of the analysis. This step would be useful to determine
the recognizability of any alternative words entered into the grammar by the modifying
step, thereby ensuring that the change increased the performance of the recognizer.
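
As a rough sketch of the loop recited in steps (i) through (v), the following assumes two hypothetical callables supplied by the caller: `measure_quality`, standing in for steps (i)-(iv), and `modify_grammar`, standing in for the modifying step. The name `tune_grammar`, the `low`/`high` bounds, and the `max_rounds` cap are likewise assumptions made for illustration, not features taken from the claims or the references.

```python
def tune_grammar(grammar, measure_quality, modify_grammar,
                 low, high, max_rounds=10):
    """Re-test and re-modify a grammar until its quality measure
    falls inside the pre-determined range [low, high]."""
    for _ in range(max_rounds):
        quality = measure_quality(grammar)   # steps (i)-(iv)
        if low <= quality <= high:
            return grammar, quality          # within range: stop
        grammar = modify_grammar(grammar)    # step (v): re-modify
    raise RuntimeError("quality did not reach the range in max_rounds")
```

The cap on rounds is a design choice of this sketch so that a grammar that never reaches the range does not loop forever.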

12. Consider claim 7, Mahajan teaches a computer program product (figure 1 shows 
memories 141, 151, 152, 155, and 156 capable of storing the computer code) for 
evaluating the quality of voice input recognition by a voice system (figure 2, shows a 
method for evaluating recognition in a voice system such as figure 1, connected to Wide
area Network 173, that could be used to access data), said computer program product
comprising computer usable program code including a routine set of instructions which
when executed by a machine cause the machine to perform the steps of: 

extracting a current grammar (text words in the recognition model to be 
evaluated) from the voice system (a portion of training text is selected to be spoken 304, 
Figure 3, Column 5 line 11.);

generating a test input for the current grammar, the test input including a test 
pattern and a set of active grammars for the current grammar (At step 202, a portion of 
training data 304 is spoken by a person 308 to generate a test signal; Column 5 line 
11.); 

providing the test input to the voice system (speech recognition system of figure 
1) (The acoustic signal is converted into feature vectors by receiver 309 and feature 
extractor 310, and the feature vectors are provided to a decoder 312; column 5 lines 13-15.);

analyzing the test pattern with respect to the set of active grammars with a 
speech recognition engine in the voice system (At step 204, the predicted sequence of
speech units is aligned with the actual sequence of speech units from training data 304;
column 5, line 37.); and

deriving a measure of quality of recognition for the current grammar (Under one 
embodiment, this objective function is an error function that indicates the degree to 
which the predicted sequence of speech units differs from the actual sequence of 
speech units after the alignment is complete; column 5, lines 44-47.).

But Mahajan does not specifically teach that the voice system is a voice portal. 

In the same field of speech systems, Yuschik teaches that the voice system is a
voice portal (It is an object of the invention to design and select the vocabulary for a 
voice activated system (portal) column 3, line 7-20. The menus of the portal are shown 
in figure 4.) 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention for a voice portal to be the voice system being tested and developed as 
taught by Yuschik with the testing system of Mahajan in order to allow for voice portals 
to be adapted to the users' spoken languages, Yuschik column 2 line 57.

13. Consider claim 9, Yuschik teaches the machine readable storage of claim 7, 
comprising the steps of: 

modifying the current grammar to create a modified grammar (word list to be 
used for recognition) if the measure of quality of recognition for the current grammar 
deviates from a pre-determined range (figure 3, step 340 does an acoustic analysis to 
determine similarity in order to reduce recognition error, step 350 selects alternative 
words if necessary, thereby providing a less confusable alternative to the words 
available to be recognized; column 11 line 34- column 13 line 3).

14. Consider claim 10, Mahajan in view of Yuschik suggests the computer readable 
storage of claim 9, further comprising the steps of: 

(i) generating a test input for the modified grammar, the test input including a test 
pattern and a set of active grammars for the modified grammar; 

(ii) providing the test input for the modified grammar to the voice portal;

(iii) analyzing the test pattern for the modified grammar with respect to the set of 
active grammars corresponding to the modified grammar with the speech recognition 
engine in the voice portal; 

(iv) deriving a measure of quality of recognition of the modified grammar; and 

(v) re-modifying the modified grammar and repeating steps (i) through (iv) until 
the measure of quality of recognition of the modified grammar does not deviate from a 
pre-determined range. 

This is merely reanalyzing the output of the recognizer after the grammar has 
been updated. Figure 3 of Yuschik shows that the acoustical analysis of 340 is 
repeated until the acoustical difference is great enough to allow for accurate speech 
recognition. These analysis steps (i-iv) are the same as those of claim 1, which can clearly be
accomplished by the method of Mahajan as discussed above, and acoustical distance
would obviously affect the result of the analysis. This step would be useful to determine
the recognizability of any alternative words entered into the grammar by the modifying
step, thereby ensuring that the change increased the performance of the recognizer.

15. Claims 2 and 8 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Mahajan in view of Yuschik as applied to claims 1 and 7 above and further in view of 
Reich (US PAP 2002/0173955).

16. Consider claim 2, Mahajan and Yuschik teach the method of claim 1. Mahajan
further teaches including comparing recognition results for the test input with an
expected value to assess the measure of quality of recognition (Under one embodiment,
this objective function is an error function that indicates the degree to which the
predicted sequence of speech units differs from the actual sequence of speech units
after the alignment is complete; column 5, lines 44-47.), but does not specifically teach
wherein said deriving step includes the step of deriving a confidence level and a set of 
n-best results for the test input, and further comprising the steps of: 

comparing the confidence level and set of n-best results for the test input with an 
expected value to assess the measure of quality of recognition. 

In the same field of speech recognition, Reich teaches the step of deriving a 
confidence level and a set of n-best results for the test input (figure 4 shows step 420, 
confidence scores are determined, step 460 N best candidates are selected. The N-best 
candidate with the highest confidence level would inherently exceed the expected 
value.).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the confidence scores and N best candidates as taught by Reich 
and compare them with an expected value to determine a quality of the recognition as 
taught by Mahajan and Yuschik in order to provide a more robust method of evaluating 
a recognition system by allowing one to spot ambiguities in the recognition. 
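
To make the comparison step concrete, here is a minimal sketch of checking a confidence level and an n-best list against an expected value. It assumes the recognizer returns `(text, confidence)` pairs; the function name `assess_recognition` and the `min_confidence` threshold are hypothetical and are not drawn from Reich or Mahajan.

```python
def assess_recognition(n_best, expected, min_confidence):
    """Compare an n-best list of (text, confidence) pairs
    with the expected transcript for a test input."""
    if not n_best:
        raise ValueError("n_best must contain at least one hypothesis")
    # Rank hypotheses from most to least confident.
    ranked = sorted(n_best, key=lambda item: item[1], reverse=True)
    texts = [text for text, _ in ranked]
    top_text, top_confidence = ranked[0]
    return {
        "top_correct": top_text == expected,
        "confident": top_confidence >= min_confidence,
        # 1-based rank of the expected text, or None if absent.
        "expected_rank": texts.index(expected) + 1 if expected in texts else None,
    }
```

A result in which the expected text appears only at rank 2 or lower is the sort of ambiguity that such a comparison would surface.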

Consider claim 8, Mahajan and Yuschik teach the storage of claim 7. Mahajan
further teaches including comparing recognition results for the test input with an
expected value to assess the measure of quality of recognition (Under one embodiment,
this objective function is an error function that indicates the degree to which the
predicted sequence of speech units differs from the actual sequence of speech units
after the alignment is complete; column 5, lines 44-47.), but does not specifically teach
wherein said deriving step includes the step of deriving a confidence level and a set of 
n-best results for the test input, and further comprising the steps of: 

comparing the confidence level and set of n-best results for the test input with an 
expected value to assess the measure of quality of recognition. 

In the same field of speech recognition, Reich teaches the step of deriving a 
confidence level and a set of n-best results for the test input (figure 4 shows step 420, 
confidence scores are determined, step 460 N best candidates are selected. The N-best
candidate with the highest confidence level would inherently exceed the expected
value.).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the confidence scores and N best candidates as taught by Reich 
and compare them with an expected value to determine a quality of the recognition as 
taught by Mahajan and Yuschik in order to provide a more robust method of evaluating 
a recognition system by allowing one to spot ambiguities in the recognition. 

17. Claims 5, 6, 11-13, 15, and 16 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Mahajan in view of Yuschik as applied to claims 1 and 7 above and 
further in view of Randic (US Patent 6,275,797). 

18. Consider claim 5, Mahajan and Yuschik teach the method of claim 1, but do
not specifically teach modifying the test pattern to emulate one or more user voices prior
to entering the test input into the voice portal. 

In the same field of speech testing, Randic suggests modifying the test pattern to 
emulate one or more user voices prior to entering the test input into the voice portal 
(Figure 1 shows using a voice test file generated by a TTS engine used to test the voice 
path using recognition. This is a similar technique used to test the quality of recognition 
in Mahajan. Using a computer generated voice to generate the test file, Column 3 line 
27, would inherently allow the test pattern to emulate whatever voice the computer 
generation system was configured to produce. Further, it is well known in the art that 
TTS engines can be configured to allow for the generation of multiple voice types, 
although the claim language suggests that just one voice could be used.).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the computerized speech generation as taught by Randic in 
place of the human speaker as taught by Mahajan and Yuschik in order to allow the 
speech recognizer to become more flexible through the quality analysis. 

19. Consider claim 6, Mahajan and Yuschik teach the method of claim 1, but do
not specifically teach modifying the test pattern to emulate the influence of one or more
communications network qualities prior to entering the test input into the voice portal. 

In the same field of speech testing, Randic teaches modifying the test pattern to 
emulate the influence of one or more communications network qualities prior to entering 
the test input into the voice portal (figure 3 shows passing the voiced speech pattern 
through a transmission scheme in order to evaluate the effect that the voice channel 
has on recognition; column 4, line 31- column 7 line 29.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to combine the analysis of the voice channel as taught by Randic with 
the speech recognition quality evaluation of Mahajan and Yuschik in order to make the 
speech recognizer more robust. 

20. Consider claim 11, Mahajan and Yuschik teach the computer readable storage
of claim 7, but do not specifically teach modifying the test pattern to emulate one or
more user voices prior to entering the test input into the voice portal. 

In the same field of speech testing, Randic teaches modifying the test pattern to 
emulate one or more user voices prior to entering the test input into the voice portal 
(Figure 1 shows using a voice test file generated by a TTS engine used to test the voice 
path using recognition. This is a similar technique used to test the quality of recognition 
in Mahajan. Using a computer generated voice to generate the test file, Column 3 line 
27, would inherently allow the test pattern to emulate whatever voice the computer 
generation system was configured to produce. Further, it is well known in the art that
TTS engines can be configured to allow for the generation of multiple voice types,
although the claim language suggests that just one voice could be used.).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the computerized speech generation as taught by Randic in 
place of the human speaker as taught by Mahajan and Yuschik in order to allow for 
more efficient and more accurate quality analysis of the recognizer. 

21. Consider claim 12, Mahajan and Yuschik teach the computer readable storage
of claim 7, but do not specifically teach modifying the test pattern to emulate the
influence of one or more communications network qualities prior to entering the test 
input into the voice portal. 

In the same field of speech testing, Randic suggests modifying the test pattern to 
emulate the influence of one or more communications network qualities prior to entering 
the test input into the voice portal (figure 3 shows passing the voiced speech pattern 
through a transmission scheme in order to evaluate the effect that the voice channel 
has on recognition; column 4, line 31- column 7 line 29.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to combine the analysis of the voice channel as taught by Randic with 
the speech recognition quality evaluation of Mahajan and Yuschik in order to make the 
speech recognizer more robust. 

22. Consider claim 13, Mahajan teaches a system for evaluating the quality of voice
input recognition by a voice system having a speech recognition engine (figure 3), 
comprising: 

an analysis interface for extracting a set of current grammars from the voice
system (a portion of training text is selected to be spoken 304, Figure 3, Column 5 line
11.);

a test pattern generator for generating a test input for each current grammar, the 
test input including a test pattern and a set of active grammars corresponding to each 
current grammar (At step 202, a portion of training data 304 is spoken by a person 308 
to generate a test signal; Column 5 line 11.);

an apparatus for entering each test pattern into the voice system (At step 202, a 
portion of training data 304 is spoken by a person 308 to generate a test signal; Column 
5 line 11.); 

a results collector for analyzing each test pattern entered into the voice system 
with the speech recognition engine against the set of active grammars corresponding to 
the current grammar for said test pattern (At step 204, the predicted sequence of
speech units is aligned with the actual sequence of speech units from training data 304;
column 5, line 37.); and

a results analyzer for deriving a set of statistics of a quality of recognition of each 
current grammar (Under one embodiment, this objective function is an error function 
that indicates the degree to which the predicted sequence of speech units differs from 
the actual sequence of speech units after the alignment is complete; column 5, lines 44-47.).

However, Mahajan does not specifically teach that the voice system is a voice
portal or using a text to speech engine to enter data into the voice portal.

In the same field of speech systems, Yuschik teaches that the voice system is a 
voice portal (It is an object of the invention to design and select the vocabulary for a 
voice activated system (portal) column 3, line 7-20. The menus of the portal are shown 
in figure 4.).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention for a voice portal to be the voice system being tested and developed as 
taught by Yuschik with the testing system of Mahajan in order to allow for voice portals 
to be adapted to the users' spoken languages, Yuschik column 2 line 57.

But Mahajan and Yuschik do not specifically teach using a text to speech
engine to enter data into the voice portal.

In the same field of speech signal testing, Randic teaches using a text to speech 
engine to generate test signals for a system (Figure 1 shows using a voice test file 
generated by a TTS engine used to test the voice path using recognition. This is a 
similar technique used to test the quality of recognition in Mahajan. Using a computer 
generated voice to generate the test file, Column 3 line 27, would inherently allow the 
test pattern to emulate whatever voice the computer generation system was configured 
to produce.). 

Therefore it would have been obvious to one of ordinary skill in the art at the time
of the invention to use the computerized speech generation as taught by Randic in 
place of the human speaker as taught by Mahajan in order to allow for more efficient 
and more comprehensive quality analysis of the recognizer. 

23. Consider claim 15, Mahajan and Yuschik in view of Randic teach the system of
claim 13, but do not specifically teach modifying the test pattern to emulate one or
more user voices prior to entering the test input into the voice portal. 

However Randic teaches modifying the test pattern to emulate one or more user 
voices prior to entering the test input into the voice portal (Figure 1 shows using a voice 
test file generated by a TTS engine used to test the voice path using recognition. This 
is a similar technique used to test the quality of recognition in Mahajan. Using a 
computer generated voice to generate the test file, Column 3 line 27, would inherently 
allow the test pattern to emulate whatever voice the computer generation system was 
configured to produce. Further, it is well known in the art that TTS engines can be 
configured to allow for the generation of multiple voice types, although the claim 
language suggests that just one voice could be used.).

Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the computerized speech generation as taught by Randic to 
emulate a user voice in order to allow for more efficient and more accurate quality 
analysis of the recognizer. 

24. Consider claim 16, Mahajan teaches the system of claim 13, wherein the test
pattern generator is modified to emulate the influence of one or more communications
network qualities prior to entering the test input into the voice portal (figure 3 shows
passing the voiced speech pattern through a transmission scheme in order to evaluate 
the effect that the voice channel has on recognition; column 4, line 31- column 7 line 
29.). 

25. Claim 14 is rejected under 35 U.S.C. 103(a) as being unpatentable over Mahajan 
in view of Yuschik in view of Randic as applied to claim 13 above, and further in view of 
Reich. 

26. Consider claim 14, Mahajan in view of Randic teaches the system of claim 13, 
including comparing recognition results for the test input with an expected value to 
assess the measure of quality of recognition for each current grammar (Under one 
embodiment, this objective function is an error function that indicates the degree to 
which the predicted sequence of speech units differs from the actual sequence of 
speech units after the alignment is complete; column 5, lines 44-47.), but does not
specifically teach that the statistics include a confidence level and a set of n-best results 
for each test input. 

In the same field of speech recognition, Reich teaches the step of deriving a 
confidence level and a set of n-best results for the test input (figure 4 shows step 420, 
confidence scores are determined, step 460 N best candidates are selected.).




Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to use the confidence scores and N best candidates as taught by Reich 
and compare them with an expected value to determine a quality of the recognition as 
taught by Mahajan in view of Yuschik in view of Randic in order to provide a more 
robust method of evaluating a recognition system by allowing one to spot ambiguities in 
the recognition. 

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Douglas C. Godbold whose telephone number is (571) 
270-1451. The examiner can normally be reached on Monday-Thursday 7:00am-4:30pm
and Friday 7:00am-3:30pm.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick Edouard, can be reached at (571) 272-7603. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 